Improving Bayesian Reinforcement Learning Using Transition Abstraction

Author

  • Daniel Acuna
Abstract

Bayesian Reinforcement Learning (BRL) provides an optimal solution to on-line learning while acting, but it is computationally intractable for all but the simplest problems: at each decision time, an agent must weigh all possible courses of action by beliefs about future outcomes constructed over long time horizons. To improve tractability, previous research has focused on sparsely sampling the courses of action most relevant to computing value; however, sampling alone does not scale well to larger environments. In this paper, we investigate whether an abstraction called projects, parts of the transition dynamics that bias the lookahead toward promising areas of the environment, can scale up BRL to larger environments. We modify a sparse sampler to incorporate projects. We test our algorithm on standard problems that require an effective exploration–exploitation balance and show that learning can be significantly sped up compared to a simpler BRL method and classic Q-learning.
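The abstract does not reproduce the algorithm itself, but the idea lends itself to a short sketch. The Python below is a minimal, assumption-laden illustration of a sparse-sampled Bayesian lookahead with a project bias: Dirichlet pseudo-counts stand in for the belief state, and the class name, the `projects` map, and the fixed 2x sampling bias are all invented here for illustration, not taken from the paper.

```python
import random
from collections import defaultdict

class SparseSamplerWithProjects:
    """Sketch of a sparse-sampled Bayesian lookahead. Dirichlet counts
    over (state, action) -> next-state model the agent's beliefs;
    `projects` is a hypothetical map from states to 'promising' regions
    that receive extra sampling weight during lookahead."""

    def __init__(self, actions, projects, depth=3, width=5, gamma=0.95):
        self.actions = actions
        self.projects = projects          # assumed: {state: set(next_states)}
        self.depth, self.width, self.gamma = depth, width, gamma
        self.counts = defaultdict(lambda: defaultdict(lambda: 1.0))  # Dirichlet(1) prior
        self.reward = defaultdict(float)  # running reward estimate per (s, a)

    def sample_next(self, s, a):
        """Sample a successor from the posterior predictive, biasing
        probability mass toward successors inside the current project."""
        items = list(self.counts[(s, a)].items())
        weights = [c * (2.0 if ns in self.projects.get(s, ()) else 1.0)
                   for ns, c in items]
        return random.choices([ns for ns, _ in items], weights)[0]

    def q_estimate(self, s, a, d):
        """Sparse-sampling Q estimate: average over `width` sampled successors."""
        if d == 0 or not self.counts[(s, a)]:
            return self.reward[(s, a)]
        total = 0.0
        for _ in range(self.width):
            ns = self.sample_next(s, a)
            total += max(self.q_estimate(ns, b, d - 1) for b in self.actions)
        return self.reward[(s, a)] + self.gamma * total / self.width

    def act(self, s):
        return max(self.actions, key=lambda a: self.q_estimate(s, a, self.depth))

    def update(self, s, a, r, ns):
        self.counts[(s, a)][ns] += 1.0
        self.reward[(s, a)] += 0.1 * (r - self.reward[(s, a)])
```

The project bias only reweights which successors get sampled during lookahead; the belief update itself stays unchanged, which is one plausible reading of "biasing the lookahead" in the abstract.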


Similar Resources

Using trajectory data to improve bayesian optimization for reinforcement learning

Recently, Bayesian Optimization (BO) has been used to successfully optimize parametric policies in several challenging Reinforcement Learning (RL) applications. BO is attractive for this problem because it exploits Bayesian prior information about the expected return and uses this knowledge to select new policies to execute. Effectively, the BO framework for policy search addresses the expl...
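As a rough illustration of the BO-for-policy-search loop this blurb describes, here is a minimal sketch using scikit-learn's GaussianProcessRegressor as the surrogate and an upper-confidence-bound acquisition. The function names, the UCB rule, and the random candidate set are assumptions for illustration; the paper's trajectory-data contribution is not shown, since the text is truncated before describing it.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def bo_policy_search(evaluate_return, dim, n_iters=30, n_candidates=256, seed=0):
    """Minimal BO loop for policy search: a GP surrogate over policy
    parameters, with a UCB acquisition choosing the next policy to run.
    `evaluate_return` (assumed, user-supplied) executes the policy with
    parameters theta and returns its empirical return."""
    rng = np.random.default_rng(seed)
    X = [rng.uniform(-1, 1, dim)]              # initial random policy
    y = [evaluate_return(X[0])]
    gp = GaussianProcessRegressor(normalize_y=True)
    for _ in range(n_iters):
        gp.fit(np.array(X), np.array(y))       # refit surrogate on all returns
        cand = rng.uniform(-1, 1, (n_candidates, dim))
        mu, sigma = gp.predict(cand, return_std=True)
        theta = cand[np.argmax(mu + 2.0 * sigma)]   # UCB acquisition
        X.append(theta)
        y.append(evaluate_return(theta))
    return X[int(np.argmax(y))]                # best policy parameters found
```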


Learning Qualitative Markov Decision Processes

To navigate in natural environments, a robot must decide the best action to take according to its current situation and goal, a problem that can be represented as a Markov Decision Process (MDP). In general, it is assumed that a reasonable state representation and transition model can be provided by the user to the system. When dealing with complex domains, however, it is not always easy or pos...
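To make concrete what a user-provided state representation and transition model look like in the tabular case this blurb alludes to, here is a toy hand-specified MDP with a value-iteration loop. All states, probabilities, and rewards are invented for illustration.

```python
# Minimal tabular MDP: the kind of hand-specified transition model the
# blurb says is usually assumed to be provided by the user.
P = {  # (state, action) -> list of (next_state, probability)
    ("s0", "go"):   [("s1", 0.8), ("s0", 0.2)],
    ("s0", "stay"): [("s0", 1.0)],
    ("s1", "go"):   [("s1", 1.0)],
    ("s1", "stay"): [("s0", 0.5), ("s1", 0.5)],
}
R = {("s0", "go"): 0.0, ("s0", "stay"): 0.0,
     ("s1", "go"): 1.0, ("s1", "stay"): 0.0}
gamma, V = 0.9, {"s0": 0.0, "s1": 0.0}
for _ in range(100):  # value iteration over the hand-built model
    V = {s: max(R[(s, a)] + gamma * sum(p * V[ns] for ns, p in P[(s, a)])
                for a in ("go", "stay"))
         for s in ("s0", "s1")}
```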


Proceedings of the ICML / UAI / COLT Workshop on Abstraction in Reinforcement Learning

Bayesian Reinforcement Learning (BRL) provides an optimal solution to on-line learning while acting, but it is computationally intractable for all but the simplest problems: at each decision time, an agent should weigh all possible courses of action by beliefs about future outcomes constructed over long time horizons. To improve tractability, previous research has focused on sparsely sampling p...


A Core Task Abstraction Approach to Hierarchical Reinforcement Learning: (Extended Abstract)

We propose a new core task abstraction (CTA) approach to learning the relevant transition functions in model-based hierarchical reinforcement learning. CTA exploits contextual independences of the state variables conditional on the task-specific actions; its promising performance is demonstrated through a set of benchmark problems.
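The extended abstract gives no code, but the contextual independence it mentions can be sketched: each task-specific action's transition sub-model reads and writes only the state variables relevant to it. The variable names, the RELEVANT map, and the factorization below are illustrative assumptions, not the paper's actual CTA construction.

```python
from typing import Dict

State = Dict[str, int]

RELEVANT = {            # assumed: action -> state variables it can change
    "move":  ("x", "y"),
    "grasp": ("holding",),
}

def step(state: State, action: str, submodels) -> State:
    """Apply only the sub-model for `action`'s relevant variables;
    all other variables stay unchanged (the contextual independence)."""
    nxt = dict(state)
    relevant = {v: state[v] for v in RELEVANT[action]}
    nxt.update(submodels[action](relevant))   # sub-model sees few variables
    return nxt

# Toy usage: each sub-model is learned (or here, hard-coded) over only
# its own variables, which is the payoff of the factorization.
submodels = {"move":  lambda rv: {"x": rv["x"] + 1, "y": rv["y"]},
             "grasp": lambda rv: {"holding": 1}}
print(step({"x": 0, "y": 0, "holding": 0}, "move", submodels))
# -> {'x': 1, 'y': 0, 'holding': 0}
```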


Self-Organizing Perceptual and Temporal Abstraction for Robot Reinforcement Learning

A major current challenge in reinforcement learning research is to extend methods that work well on discrete, short-range, low-dimensional problems to continuous, high-diameter, high-dimensional problems, such as robot navigation using high-resolution sensors. We present a method whereby a robot in a continuous world can, with little prior knowledge of its sensorimotor system, environment, and ...



Journal:

Volume:   Issue:

Pages: -

Publication date: 2009